Introduction


This report describes for the potential user a set of procedures for
processing textual materials on-line. The present system, operational on
the IBM 7030 computer, is a preliminary model of a more sophisticated
system currently under development (Walker, 1967). In the preliminary model
an information analyst can scan through messages, reports, and other
documents on a display scope and select relevant facts, which are processed
linguistically and then stored in the computer in the form of logical
content representations. To satisfy a specific requirement the analyst can
initiate a search through the stored representations to identify those
with relevant content. Once these are identified, the original texts from
which they were derived can be recovered, and, with suitable on-line
editing and the addition of commentary and interpretation by the analyst,
formed directly into reports on the topics under consideration.

The model SAFARI system can conveniently be regarded as having two
phases, representation and recovery. The input to the representation phase
is raw text: a message, a news article, a report, etc. After it has been
read into the computer, the text is available for display on a CRT to the
analyst, who reads the text and extracts from it the salient facts which he
wishes to store. He then phrases these facts in simple English declarative
sentences, which are called "kernels", and submits them to the machine for
processing and storage. A kernel may be constructed on the display by
lightgunning, a word at a time, the component words from the displayed raw
text, using the typewriter to introduce desired words not present in that
text. Alternatively, the kernel itself may be entered from the typewriter
in its entirety. When a kernel is complete, the analyst lightguns the
command "process", and programs for syntactic analysis and semantic
interpretation convert the sentence automatically into a content
representation. The analyst then can begin to construct another kernel.

If  for  any  reason  a  kernel  cannot  be  processed,  the  system  provides 
the  analyst  both  with  information  about  the  nature  of  the  problem  and  with 
the  basis  for  correcting  it.  The  nature  of  this  interaction  between  the 
analyst  and  the  system  can  be  appreciated  best  in  the  context  of  a  more 
detailed  description  of  the  procedures  involved. 

The first step in the machine processing of a kernel is a lexical
lookup of every word in the kernel. If any words not in the machine's
current lexicon are encountered, the analyst is asked to classify them.
At present, the classification is merely part-of-speech assignment as
noun, verb, or adjective. The list of classes is displayed, and the
appropriate one lightgunned. This information is internalized by the
machine and retained for future reference. When it has classifications
for all the words in a kernel, the machine attempts to parse it according
to the grammar of the system. If there are no parsings, it can be
for one of two reasons. Either the kernel is simply not acceptable to
the grammar, in which case it must be rephrased, or there is a wrong
lexical classification in terms of the current kernel (for example,
supply may be listed in the lexicon only as a verb, while in the current
kernel it is being used as a noun). To ascertain whether the latter is
the case, the machine displays all of the words in the kernel with their
associated classifications. The analyst scans this list, and, if a
classification is incomplete, he supplies the necessary additional
information. The lexicon is updated and the kernel reprocessed.

If  more  than  one  parsing  is  obtained,  the  kernel  is  ambiguous  and  the 
analyst  is  informed  of  this  fact  so  that  he  may  rephrase. 

The optimum and most usual case is that exactly one parsing is obtained.
The machine makes a semantic interpretation of the kernel based on its
grammatical structure. This interpretation at present consists of
determining the activity expressed in the sentence, the subject, object,
place and time of the activity, and any property or indication of quantity
associated with the subject, object, and place. These facts are structured
into a logical form which is stored as the content representation of the
kernel.

The recovery phase begins with the analyst informing the machine that
he wishes to make a query. The machine displays a list of categories
corresponding to the kinds of information it possesses (those identified in
the previous paragraph). The analyst enters in this display, via typewriter
and lightgun, the particular things he wishes to find out more about,
with query indicators on the categories he desires the machine to fill in.
For example, if he wishes to find out the actions taken by American planes
during July, he enters by SUBJECT planes, by PROPERTY (of subject) American,
by TIME July or during July, and by ACTION a query indicator. The machine
searches its store of content representations of previously input kernel
sentences. If any meet the specifications, they are searched in the queried
categories. The information obtained is displayed, appropriately labeled,
below the still-showing query list. (If none meet the specifications, the
analyst is informed of this.) Confronted with the new information, the
analyst may sharpen, broaden, or change his query as he likes, and the
machine will recompute the information accordingly.

The analyst may at this point be satisfied with what he has learned.
He may, on the other hand, want to see the context of some just-learned
fact. By lightgunning any of the displayed answers, he can retrieve a
display of the original section of raw text from which the kernel yielding
that answer was derived. Editing operations can then be performed on
this displayed text to reduce it to its essential elements, which may be
stored with related portions of text. The accumulations of related text
entries can themselves be displayed and edited; commentaries or
interpretations can be inserted by the analyst; and then the result can be
printed out on- or off-line for report generation.

Flow charts for the representation and recovery phases of the
preliminary model are presented in Figure 1. A film illustrating the
operation of the system has been prepared (Chapin, 1967).

The SAFARI text processing system is programmed in TREET, a
list-processing programming system adapted for on-line use in a display
environment and enabling facile manipulation of textual data through the
use of lightgun and on-line typewriter (Haines, 1965, 1967). Most of the
programming for the system was done on-line using the "On-line Programming
System" developed at MITRE (Gross, 1967). SAFARI has been implemented on
the IBM 7030 computer with DD-13 display consoles. An augmented version of
the system is being reprogrammed for an IBM System/360 facility.

In SAFARI the display scope is divided functionally into four
sections. The center, comprising the major portion of the scope, is used
primarily for displaying text, although it will contain lists of grammatical
categories or of lexical entries under the circumstances described above.
The center also can be used to display tree structures representing either
the results of the syntactic parsing of the kernel or its content
representation. The left margin contains alternative sets of frequently used
control words which can be activated by lightgun. The top margin is used
to identify the mode in which the system is currently operating or to
indicate the action that would be initiated by the next input to the system.
The bottom margin is used to display words not currently defined in the
lexicon.

Figure 1. Flow charts for the preliminary model of a text-processing
system. (Dashed lines indicate operator interaction.)

I. Representation Phase

A.  Entry 

The  system  is  in  edit  mode  initially  (cf.  the  On-Line  Programming 
System:  User's  Manual  (Gross,  1967)  for  a  description).  Text  is  called 
up  by  typing  EDIT  (name)  where  name  contains  textual  data  stored  as  a 
series  of  cards  each  of  which  contains  no  more  than  64  characters.  The 
number  of  cards  is  limited  only  by  data  storage  capacity.  The  display 
contains  up  to  30  cards  of  text  (a  card  is  equivalent  to  one  line). 
Kernels  are  constructed  from  a  set  of  ten  cards.  If  the  portion  of  text 
being  displayed  is  exactly  ten  cards  long,  typing  KERNEL  will  initiate 
the  kernelizing  mode,  and  CONSTRUCT  will  appear  in  the  top  margin  of  the 
display.  If  more  than  ten  cards  are  displayed,  typing  GULP  will  cause 
SELECT  to  appear  in  the  top  margin,  after  which  one  of  the  cards  may  be 
lightgunned.  This  card  will  be  displayed  as  the  first  of  ten  cards,  the 
kernelizing mode will be initiated, and CONSTRUCT will appear in the top
margin.

B.  Constructing  the  Kernel 

In  the  kernelizing  mode,  any  words  lightgunned  will  be  displayed 
successively  on  a  line  below  the  text.  The  words  may  be  selected  in  any 
order  irrespective  of  their  position  in  the  text.  Words  also  can  be 
entered  from  the  typewriter  after  the  left-margin  control  word  TYPIN 
has  been  lightgunned,  as  described  in  the  next  section. 

C.  Left-Margin  Control  Words  in  Kernelizing  Mode 

During the kernelizing mode, the left margin contains the following
control words:
NEWKERNEL

CRG-RET 

TYPIN 

FIRSTPG 

NEXTPG 

PREVPG

CONSTRUCT 

PROCESS 

QUERY 

EDIT 

These control words may be lightgunned to perform the following
actions:

1. NEWKERNEL - deletes any kernel or part of kernel present and
initializes the system for constructing a new one.

2. CRG-RET - (carriage return) provides that the next addition
to the kernel will be on a new card.

3. TYPIN - provides that the next addition to the kernel will be
entered from the typewriter. If the typewriter addition contains enough words
to carry the kernel onto a new card, the kernel should be completed from this
same typewriter entry (this limitation is temporary). A whole kernel can be
composed from the typewriter. After the words are added to the kernel,
further additions to the kernel may be made with the lightgun.

4. FIRSTPG - changes the display of cards so that the first card
of the text becomes the top card being displayed; any kernel on the display
scope is unaffected. (See note after 6 below.)

5. NEXTPG - changes the display of cards so that the card second
from the bottom becomes the new top card; any kernel on the display scope
is unaffected. (See note after 6 below.)

6. PREVPG - changes the display of cards so that the card
which previously was the top card of the display becomes the new top
card; this action essentially undoes a NEXTPG action; any kernel on the
display scope is unaffected. (See note below.)

Note: FIRSTPG, NEXTPG, and PREVPG are analogous to the control words
FIRSTPAGE, NEXTPAGE, and PREVPAGE described in the On-Line Programming
System: User's Manual except that

a) ten cards of text are displayed at all times with
an overlap between pages of two cards; if NEXTPG is
selected and there are not ten more cards of text
(beginning with the ninth card of the current display),
the remaining positions will be filled with blank cards;
if PREVPG is selected and there are not eight cards
of text preceding the first card displayed, the first
ten lines of the text will be displayed;

b) FIRSTPG is defined by the cards of text displayed when
the kernelizing mode was entered after typing KERNEL, or
by the cards selected following GULP, or by the cards
displayed at the time NEWKERNEL is lightgunned.

7. CONSTRUCT - provides that any word of the text subsequently
lightgunned will be added to the kernel being constructed. CONSTRUCT must
be used if, during the course of constructing a kernel, editing functions
which involve lightgunning a line have been used.

8. PROCESS - causes the kernel which has been constructed to
be submitted for processing; for a description of the possible outcomes
following PROCESS, see Section D. below.

9. QUERY - causes the query mode to be initialized with the
appropriate display as described in Section II.A. below.

10. EDIT - returns the system to edit mode, the normal mode
of operation for SAFARI. The use of EDIT is necessary if for any reason
the system does not return to edit mode following some action.
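
The paging rule in the note on FIRSTPG, NEXTPG, and PREVPG above (ten
cards displayed, two cards of overlap between pages) can be illustrated
with a short Python sketch. This is not SAFARI code, which was written in
TREET; the names page, next_top, and prev_top are hypothetical, the text is
modeled as a plain list of card strings, and PREVPG is simplified to "step
back eight cards, but never before the first card".

```python
from typing import List

PAGE_SIZE = 10               # cards shown at all times in kernelizing mode
OVERLAP = 2                  # cards shared by consecutive pages
STEP = PAGE_SIZE - OVERLAP   # distance between successive top cards


def page(cards: List[str], top: int) -> List[str]:
    """Return the ten cards displayed when cards[top] is the top card."""
    window = cards[top:top + PAGE_SIZE]
    # Positions past the end of the text are filled with blank cards.
    return window + [""] * (PAGE_SIZE - len(window))


def next_top(top: int) -> int:
    """NEXTPG: the card second from the bottom becomes the new top card."""
    return top + STEP


def prev_top(top: int) -> int:
    """PREVPG (simplified): go back one page; if fewer than eight cards
    precede the current top card, show the first ten cards instead."""
    return max(top - STEP, 0)


text = [f"card {n}" for n in range(1, 26)]   # a hypothetical 25-card text
first = page(text, 0)
second = page(text, next_top(0))
third = page(text, next_top(next_top(0)))
```

Here the second page begins with the ninth card of the first page, and
the third page, which runs past the end of the 25-card text, ends with one
blank card.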

D.  Processing  the  Kernel 

When  the  processing  of  a  kernel  is  initiated,  four  outcomes  are 
possible: 

1.  One  parsing  for  the  kernel  is  found.  A  content  representation 
is  constructed  and  stored  for  each  conjoined  phrase  in  the  kernel.  A 
carriage  return  on  the  typewriter  indicates  that  the  processing  is  complete. 
No  change  in  the  display  occurs. 

2. More than one parsing is found. The response THE KERNEL IS n
WAYS AMBIGUOUS is typed out, where n is the number of parsings for the
kernel. The syntactic structures of the alternatives may be viewed by
using the function PTREE(n) (see E.4. below), where n is the number of
the alternative to be viewed. The kernel may be revised using the function
ALTER (see E.1. below) or erased by lightgunning NEWKERNEL, which will allow
a new kernel to be constructed.

3. No parsing is found for the kernel, and one or more words in
the kernel are not listed in the lexicon. The display is changed so that
the bottom margin contains the statement PLEASE CHOOSE DEFINITION FOR word,
where word is the first word in the kernel not in the lexicon. The center
of the display contains a list of the possible category types, currently
N, V, and ADJ (for noun, verb, and adjective). Lightgunning one of the
alternatives results in entering that word in the lexicon with the specified
category. The response word DEFINED AS type will be typed out, and the
processing of the kernel will be initiated again.

4. No parsings are found for the kernel, although entries for all
the words are contained in the lexicon. The response NO PARSINGS is typed
out, and the display is changed so that the bottom margin contains the
statement LEXICAL EXPANSIONS FOR EACH WORD, and the center of the display
contains each word in the kernel together with all the category types listed
for it in the lexicon. The user may now examine the listing to determine
whether some words which can belong to more than one category (e.g., supply
can be a noun or a verb) lack the entry appropriate for the current use.
Additional category assignments for a word can be made using the function
LEXADD (see E.3. below), after which lightgunning PROCESS will again initiate
processing. If the kernel is not acceptable to the grammar, it may be
revised using the ALTER function or erased by lightgunning NEWKERNEL, which
will allow a new kernel to be constructed.
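
The four outcomes above amount to a small decision procedure. The
following Python sketch is illustrative only and is not the SAFARI
implementation: the parser is abstracted to a count of parsings, and the
names first_undefined, outcome_message, lexicon, and parse_count are
hypothetical. Only the response texts are taken from the descriptions
above.

```python
from typing import Dict, List, Optional, Set


def first_undefined(kernel: List[str],
                    lexicon: Dict[str, Set[str]]) -> Optional[str]:
    """Return the first word of the kernel that has no lexicon entry."""
    for word in kernel:
        if word not in lexicon:
            return word
    return None


def outcome_message(kernel: List[str],
                    lexicon: Dict[str, Set[str]],
                    parse_count: int) -> str:
    """Map the state after PROCESS to the response the user sees."""
    missing = first_undefined(kernel, lexicon)
    if missing is not None:
        # Outcome 3: the analyst must choose N, V, or ADJ for the word.
        return f"PLEASE CHOOSE DEFINITION FOR {missing}"
    if parse_count == 0:
        # Outcome 4: every word is defined, but the grammar rejects it.
        return "NO PARSINGS"
    if parse_count == 1:
        # Outcome 1: success; only a carriage return is produced.
        return ""
    # Outcome 2: more than one parsing.
    return f"THE KERNEL IS {parse_count} WAYS AMBIGUOUS"


lexicon = {"planes": {"N"}, "bomb": {"V"}, "supply": {"V"}}
print(outcome_message(["planes", "bomb", "bridges"], lexicon, 0))
```

In this example "bridges" has no lexicon entry, so the sketch reports the
request for a definition before any question of parsing arises.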

E.  Typewriter  Control  Words  in  Kernelizing  Mode 

1. ALTER - ALTER can be used to change any part of a kernel. The
word ALTER is typed in, followed by a carriage return; the format required
for the rest of the message is

y string y newstring y

where y is any A6 character not occurring in string or newstring. String and
newstring may be of unequal length; newstring may be empty, but string
must be non-empty. The third y is optional unless newstring ends with
a blank. After the message is entered, the typewriter will execute a
carriage return, and ALTER will be displayed in the top margin.
Subsequently, any card lightgunned will be scanned, from the left, for the
first occurrence of string, which will then be replaced by newstring.
Successive lightgunning will cause further replacements. If string is
not a substring of the characters on the card selected, the response NO
MATCH FOUND is typed out. If the revised card contains more than 64
characters (the maximum number displayable) but less than 80 characters
(the maximum a card can contain), the response ALTERED CARD NOT DISPLAYED
COMPLETELY is typed out. If the revised card would be over 80 characters,
the alteration is not performed, and the response NEW CARD TOO LONG, OLD
CARD NOT ALTERED is typed out.

2. PARSENT(kernel) - kernel is any sequence of words without
punctuation. PARSENT allows a kernel to be entered directly from the
typewriter. The kernel will be processed in the same way that the
corresponding sequence of words on the display would be processed. However,
content representations created by using PARSENT will not be identified with
any segment of text.

3. LEXADD(word, newdefinition) - where word is any word and
newdefinition is a lexical category. LEXADD allows grammatical categories
to be added to the lexical entry for a word. The response word DEFINED AS
newdefinition will be typed out.

4. PTREE(n) - where n is the number of the parsing for the
current kernel. PTREE results in a display in tree format of the
syntactic structures for a kernel. Since PTREE is in TREET mode rather
than edit mode (cf. On-Line Programming System: User's Manual), it is
necessary to type E followed by a carriage return before entering PTREE.

5. LTREE(n) - where n is the number of the content representation
for the current kernel. LTREE results in a display of the content
representations for a kernel in tree format. Since LTREE also is in TREET
mode, it is necessary to type E followed by a carriage return before
entering LTREE.
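
The ALTER message format and card-length checks described in E.1. can
be sketched in Python as follows. This is a minimal illustration under
stated assumptions, not the TREET routine: parse_alter and alter_card are
hypothetical names, a card is modeled as a string without trailing padding,
and a revised card of exactly 80 characters is assumed to be accepted, a
case the description leaves open.

```python
from typing import Tuple

DISPLAY_WIDTH = 64   # maximum number of characters displayable
CARD_WIDTH = 80      # maximum number of characters a card can contain


def parse_alter(message: str) -> Tuple[str, str]:
    """Split 'y string y newstring [y]' into (string, newstring)."""
    if not message:
        raise ValueError("empty ALTER message")
    delimiter, rest = message[0], message[1:]
    parts = rest.split(delimiter)
    if len(parts) == 3 and parts[2] == "":
        parts = parts[:2]            # the optional closing delimiter
    if len(parts) != 2 or parts[0] == "":
        raise ValueError("expected: y string y newstring [y]")
    return parts[0], parts[1]


def alter_card(card: str, string: str, newstring: str) -> Tuple[str, str]:
    """Replace the first occurrence of string; return (card, response)."""
    if string not in card:
        return card, "NO MATCH FOUND"
    revised = card.replace(string, newstring, 1)   # leftmost match only
    if len(revised) > CARD_WIDTH:
        return card, "NEW CARD TOO LONG, OLD CARD NOT ALTERED"
    if len(revised) > DISPLAY_WIDTH:
        return revised, "ALTERED CARD NOT DISPLAYED COMPLETELY"
    return revised, ""


old, new = parse_alter("/SUPPLY/CARGO/")
print(alter_card("THE SUPPLY SHIP LEFT THE SUPPLY DEPOT", old, new))
```

Only the first SUPPLY on the card is changed; lightgunning the card again
would correspond to a second call on the revised card.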

II. Recovery Phase

A.  Constructing  the  Query 

The query mode may be entered by typing QUERY while in edit mode
or by lightgunning the control word QUERY in the left margin of the
kernelizing display. The response SET is typed out; QUERY appears in
the top margin of the display, the control words are in the left margin,
and the query categories are in the center:

                    QUERY

    NEXTPAGE        ACTION
    FIRSTPAGE       SUBJECT
    REQUERY         PROPERTY
    PERFORM         QUANTITY
    SHOWTEXT        OBJECT
    CONSTRUCT       PROPERTY
    EDIT            QUANTITY
    Q               LOCATION
                    PROPERTY
                    QUANTITY
                    TIME



A  query  category  is  specified  by  typing  in  the  appropriate  content  word  or 
the  symbols  NIL  or  Q.  The  typewriter  executes  a  carriage  return  in  response, 
and  the  word  ATTACH  is  displayed  in  the  top  margin.  The  entry  is  then  added 
to  the  right  of  whichever  query  category  is  lightgunned  next,  replacing  any 
prior  entry.  The  same  entry  may  be  used  to  specify  more  than  one  query 
category  by  successive  lightgun  actions.  Attaching  NIL  simply  removes  a 
previous entry. Attaching Q specifies the query categories that are to be
"answered." Q can also be set by lightgunning the Q in the left margin.

B. Performing the Search

In  a  search,  the  content  words  from  the  current  query  are  compared 
with  the  words  in  the  corresponding  categories  in  each  of  the  stored  content 
representations.  For  each  match,  the  words  in  the  content  representation 
corresponding  to  the  Q's  in  the  query  are  retrieved.  These  words  are 
listed  in  sequence  in  the  lower  part  of  the  display  center  under  the 
appropriate category headings; the information from each content
representation is contained in a separate line.
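
The search just described can be sketched in Python. This is an
illustration under assumptions, not the SAFARI search routine: a content
representation is modeled as a dictionary from category to word, the
category labels that distinguish the three PROPERTY and QUANTITY lines
(SUBJECT-PROPERTY and so on) are hypothetical, and a match is taken to mean
exact equality of words.

```python
from typing import Dict, List, Optional

Representation = Dict[str, Optional[str]]


def search(query: Representation,
           store: List[Representation]) -> List[Representation]:
    """Return, for each matching representation, the words in the
    categories marked Q in the query."""
    wanted = [cat for cat, value in query.items() if value == "Q"]
    fixed = {cat: value for cat, value in query.items()
             if value not in (None, "Q")}
    answers = []
    for rep in store:
        # Every content word of the query must match the same category.
        if all(rep.get(cat) == value for cat, value in fixed.items()):
            answers.append({cat: rep.get(cat) for cat in wanted})
    return answers


store = [
    {"ACTION": "bomb", "SUBJECT": "planes", "SUBJECT-PROPERTY": "American",
     "OBJECT": "bridges", "TIME": "July"},
    {"ACTION": "patrol", "SUBJECT": "planes", "SUBJECT-PROPERTY": "American",
     "LOCATION": "coast", "TIME": "July"},
    {"ACTION": "land", "SUBJECT": "planes", "SUBJECT-PROPERTY": "British",
     "TIME": "June"},
]
query = {"ACTION": "Q", "SUBJECT": "planes",
         "SUBJECT-PROPERTY": "American", "TIME": "July"}
answers = search(query, store)
print(answers)
```

With this hypothetical store, the query for the actions of American
planes in July yields one answer line per matching content representation,
each containing only the queried ACTION category.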

C. Left-Margin Control Words in Query Mode

These control words may be lightgunned to perform the following
actions:

1. NEXTPAGE - if the list of answers following PERFORM exceeds
17, the number that will fit in the space below the query categories,
lightgunning NEXTPAGE will cause the next 17 answers to be displayed.

2. FIRSTPAGE - returns the display of answers to the beginning
of the list.

3. REQUERY - resets the typewriter for input of a query if EDIT
or PERFORM has been used.

4. PERFORM - initiates the search procedure; the answers are
displayed in lines below the set of query categories. If there are no
answers, the response NO MATCH is typed out.

5. SHOWTEXT - after the word SELECT is displayed in the top margin,
lightgunning any of the answers will result in a recovery of the passage of
text from which the kernel of the corresponding content representation was
constructed. This text will be displayed in the select mode. (See E. below.)

6. CONSTRUCT - initializes the kernelizing mode, displaying
whatever text and kernel were shown when last in that mode.

7. EDIT - returns the system to edit mode, the normal mode of
operation for SAFARI. (See Section I.C.10 above.)

8. Q - causes a Q to be attached to the next query category
lightgunned; this action is equivalent to typing a Q.
D.  Typewriter  Control  Word  in  Query  Mode 

1. SEARCH - SEARCH allows a query to be entered directly from the
typewriter. The word SEARCH is typed in, followed by a carriage return.
The next line must contain a list of eleven entries typed in without
punctuation. The list will be the same as the entries in the corresponding
display version except that NIL must be used if there is no other
specification. After the search is performed, the appropriate answers will
be displayed, without the query categories or any headings, in the center of
the scope. SHOWTEXT can be used following SEARCH to recover the text
corresponding to a content representation.
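
As an illustration of the typed form, the Python sketch below turns a
line of eleven entries into a query. It assumes that the entries follow the
top-to-bottom order of the query categories in the display, which the
description implies but does not state outright; the function name
parse_search_line and the category labels that distinguish the repeated
PROPERTY and QUANTITY lines are hypothetical.

```python
from typing import Dict, Optional

# Assumed order: the eleven query categories as listed in the display.
CATEGORIES = (
    "ACTION",
    "SUBJECT", "SUBJECT-PROPERTY", "SUBJECT-QUANTITY",
    "OBJECT", "OBJECT-PROPERTY", "OBJECT-QUANTITY",
    "LOCATION", "LOCATION-PROPERTY", "LOCATION-QUANTITY",
    "TIME",
)


def parse_search_line(line: str) -> Dict[str, Optional[str]]:
    """Convert the line typed after SEARCH into a category -> entry map."""
    entries = line.split()
    if len(entries) != len(CATEGORIES):
        raise ValueError(f"expected {len(CATEGORIES)} entries, "
                         f"got {len(entries)}")
    # NIL marks a category with no specification.
    return {cat: (None if entry == "NIL" else entry)
            for cat, entry in zip(CATEGORIES, entries)}


typed = "Q planes American NIL NIL NIL NIL NIL NIL NIL July"
query = parse_search_line(typed)
print(query["ACTION"], query["SUBJECT"], query["TIME"])
```

The example line corresponds to the query for the actions of American
planes in July used in the Introduction.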

E.  Selecting  Text 

The select mode is entered only by first lightgunning SHOWTEXT in
the left margin of the query display and then lightgunning one of the answers
produced in response to a query. The center of the display contains the
passage of text from which the kernel of the content representation
corresponding to that answer was constructed. The text can now be edited,
annotated, and processed using the margin and typewriter control words
described below.

F. Left-Margin Control Words in Select Mode

During the select mode, the left margin contains the following
control words:

SEL-TEXT
PUT-INFO
DISP-INFO
DISP-TEXT
PRINT
DELETEN
ANSWERS
CONSTRUCT
EDIT

These control words may be lightgunned to perform the following
actions:

1. SEL-TEXT - allows a particular passage to be selected from
the text. After the word SEL-FROM is displayed in the top margin, when a
word is lightgunned, all of the words preceding it in the displayed text
will be deleted, and SEL-TO will be displayed in the top margin.
Subsequently, when a word is lightgunned, all of the words in the displayed
text following it will be deleted.

2. PUT-INFO - causes the text currently displayed to be stored
for possible use in report generation. The typewriter performs a carriage
return as a response.

3. DISP-INFO - the text passages selected and stored by using
the control word PUT-INFO are displayed.

4. DISP-TEXT - the passage of text corresponding to the answer
lightgunned immediately prior to entering the select mode is redisplayed,
replacing whatever is in the center of the scope.

5. PRINT - the text currently displayed is printed on the
higher speed printer.

6. DELETEN - the word DLET-FROM is displayed in the top
margin. After a line is lightgunned, DLET-TO is displayed in the top
margin. After another line, which is not before the first line, is
lightgunned, all of the lines between and including the lightgunned
lines will be deleted. If the second line lightgunned is before the
first line, the response TOO HIGH will be typed out.

7. ANSWERS - returns the system to the query mode as if SHOWTEXT
had just been lightgunned; the display contains whatever query and
answers were present prior to last entering the select mode. Lightgunning
another answer will result in a display of the corresponding text.

8. CONSTRUCT - initializes the kernelizing mode, displaying
whatever text and kernel were present when last in that mode.

9. EDIT - returns the system to edit mode, the normal mode of
operation for SAFARI. (See Section I.C.10 above.)
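
The DELETEN rule in item 6 can be expressed compactly. The Python
sketch below is illustrative only: deleten is a hypothetical name, the
displayed text is modeled as a list of lines, and the two lightgunned lines
are given as zero-based positions in that list.

```python
from typing import List, Tuple


def deleten(lines: List[str], first: int, second: int) -> Tuple[List[str], str]:
    """Delete lines first..second inclusive; return (lines, response)."""
    if second < first:
        # The second line lightgunned lies above the first: no deletion.
        return lines, "TOO HIGH"
    return lines[:first] + lines[second + 1:], ""


display = ["line 1", "line 2", "line 3", "line 4", "line 5"]
print(deleten(display, 1, 3))   # removes lines 2 through 4
print(deleten(display, 3, 1))   # refused with TOO HIGH
```

Lightgunning the same line twice deletes that single line, since the
second line is then not before the first.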

G.  Typewriter  Control  Words  in  Select  Mode 

The  following  functions  are  available  to  operate  on  the  texts 
displayed  in  select  mode: 

1. CHANGE - replaces a card by one or more cards. CHANGE is
typed in followed by a carriage return; then the new material is entered.
After a carriage return as response, lightgunning one of the cards on the
display will result in that card's being erased and replaced by the new
material.

2. INSERT - inserts new cards into a text display. INSERT is
typed in followed by a carriage return; then the new material is entered.
After a carriage return as response, lightgunning one of the cards on the
display will result in the insertion of the new material after that card.

3. ALTER - used to change any part of a card. (See I.E.1
above.)

4. STORE name - where name is any legal TREET symbol (cf.
The TREET Programming System). A copy of the text currently displayed
will be stored in name and the response CARDS STORED IN name will be
typed out. The text can be recovered and redisplayed by typing E, a
carriage return, and EDIT (name).

5. PUNCH - the text currently displayed is punched out off-line
as cards, and the response PUNCHED is typed out.

6. DISK n - the text currently displayed is stored on the
disk in the arc numbered n. The maximum number of cards that can be
stored in an arc is 63. In response, ARC n WRITTEN is typed out.

APPENDIX A


Major  SAFARI  Functions 

BLLFT  (TAP)  Called  by  SLCTA  in  select  mode. 

User  has  just  lightgunned  a  word  in  a  displayed  card.  BLLFT  replaces 
all  characters  of  the  card  to  the  left  of  that  word  by  blanks.  (A  word  is 
delimited  on  the  left  by  a  blank,  a  left  parenthesis,  or  a  quotation  mark.) 

BLRT (TAP) Called by SLCTB in select mode.

User has just lightgunned a word in a displayed card. BLRT replaces
all characters of the card to the right of that word by blanks. (A word
is delimited on the right by a blank, a right parenthesis, a quotation mark,
or a comma. Periods are not replaced so that sentence boundaries will be
preserved.)
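
The blanking performed by BLLFT and BLRT can be sketched in Python. This
is not the TAP code: bllft and blrt here are hypothetical re-creations, the
lightgunned word is identified by the position of one of its characters,
and the remark about periods is read as meaning that a period attached to
the selected word stays with it, which is an assumption.

```python
LEFT_DELIMITERS = ' ("'     # blank, left parenthesis, quotation mark
RIGHT_DELIMITERS = ' )",'   # blank, right parenthesis, quotation mark, comma


def bllft(card: str, pos: int) -> str:
    """Blank every character to the left of the word containing pos."""
    start = pos
    while start > 0 and card[start - 1] not in LEFT_DELIMITERS:
        start -= 1
    return " " * start + card[start:]


def blrt(card: str, pos: int) -> str:
    """Blank every character to the right of the word containing pos."""
    end = pos
    while end + 1 < len(card) and card[end + 1] not in RIGHT_DELIMITERS:
        end += 1
    return card[:end + 1] + " " * (len(card) - end - 1)


card = "THE (ENEMY) PLANES LANDED, THEN LEFT."
print(repr(bllft(card, card.index("PLANES"))))
print(repr(blrt(card, card.index("LANDED"))))
```

Both functions keep the card at its original length, so the surviving
words stay in the same display positions.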

CNSTRCT (TREET) Called by left-margin control word CONSTRUCT in
kernelizing, query, and select modes.

Indirectly calls CSTRGUN.

CSTRGUN (TAP) Called indirectly by CNSTRCT, KERNEL, NKER, and TYPINA in
kernelizing mode.

User has just lightgunned a word in a display. CSTRGUN adds this word
to a kernel being constructed by use of ALTER. (See descriptions of BLLFT
and BLRT for specification of word delimiters.) ALTER changes a string of
10 blanks to a string consisting of one blank and the word lightgunned.

CVCL (TREET) utility function.

Argument  is  a  list  of  cards.  CVCL  reads  the  cards  and  converts  the 
contents  of  these  cards  to  a  list. 

CVWCL  (TREET)  utility  function. 

Argument  is  a  list  of  cards.  CVWCL  bounds  this  list  by  cards  containing 
a  left  parenthesis  and  a  right  parenthesis  respectively,  then  calls  CVCL. 

DEXT  (TREET)  called  by  left  margin  control  word  DISP-TEXT  in  select  mode. 

DINF  (TREET)  called  by  left  margin  control  word  DISP-INFO  in  select  mode. 

FINDPARS  (TREET)  called  by  PARSENT. 

Argument is the kernel displayed. FINDPARS calls the routines which
do the actual parsing of the kernel. (A detailed account of the parsing
algorithm may be found in D. E. Walker, P. G. Chapin, M. L. Geis, L. N. Gross,
Recent Developments in the MITRE Syntactic Analysis Procedure; MITRE
Corporation, MTP-11, 1966.) If the kernel cannot be parsed because of an
undefined lexical item, FIXLEX is called. Returns to PARSENT.

FIXLEX  (TREET)  called  by  parsing  routines  through  FINDPARS. 

Displays grammatical category alternatives for words not in the lexicon.
After the user has lightgunned an alternative, parsing is resumed.




FQUERY (TREET) called by left-margin control word QUERY in kernelizing
mode.

Calls QUERY.

FRSTPG (TREET) called by left-margin control word FIRSTPG in kernelizing
mode.

See  Diagram  1. 

FUDGTOP (TREET) called by PARSENT.

Argument is a parsing of the kernel in tree representation. FUDGTOP
(a group of functions) searches the tree for elements necessary to
construct a list of content representations. FUDGTOP is called for each
parsing. Returns to PARSENT.

GULP (TREET) typed in while in edit mode.

Adds backward pointers to each line (card) of displayed text and readies
the system for selection of 10 lines by use of SETGULP. (For use of backward
pointers, see Diagram 1.) Indirectly calls SETGULP.

KERNEL  (TREET)  called  by  SETGULP  or  typed  in  while  in  edit  mode. 
Initializes  the  kernelizing  mode.  Indirectly  calls  CSTRGUN. 

LEXADD  (TREET)  typed  in  while  in  edit  mode. 

Two arguments are entered, an item name and a lexical category; the new
category is added to the list of entries for that item name in the lexicon.

LMP (TREET) called by left-margin control word PRINT in select mode.

NKER (TREET) called by left-margin control word NEWKERNEL in kernelizing
mode.

Indirectly calls CSTRGUN.

NLIN (TREET) called by left-margin control word CRG-RET in kernelizing
mode.

NXTPG (TREET) called by left-margin control word NXTPG in kernelizing
mode.

See Diagram 1.

PARSENT (TREET) called by PROKERN.

User has just lightgunned the left-margin control word PROCESS. Argument
is the kernel displayed. PARSENT calls FINDPARS to find all the parsings
of the sentence. If there are no parsings, PARSENT calls TELL, which
displays the lexical entries for each word. For each parsing found,
FUDGTOP is called to construct a list of content representations. PARSENT
then concatenates this list to the list of all content representations and
attaches to each new content representation a pointer to the original text
from which it was derived.

PROKERN (TREET) called by left-margin control word PROCESS in kernelizing
mode.

Calls PARSENT after using CVWCL on displayed kernel.

PRVPG (TREET) called by left-margin control word PREVPG in kernelizing
mode.

See Diagram 1.

PTINF (TREET) called by left-margin control word PUT-INFO in select
mode.

QUERY (TREET) called by left-margin control word QUERY (via FQUERY)
in kernelizing mode or typed in while in edit mode.

Initializes the query mode. Indirectly calls STMK.

REQUERY (TREET) called by left-margin control word REQUERY in query
mode.

Indirectly calls STMK.

RETANS (TREET) called by left-margin control word ANSWERS in select
mode.

Indirectly calls SHOWT.

SEARCH (TREET) typed in while in edit mode.

Argument is a list of eleven members, which is interpreted as a query.
Calls SRCH.

SETGULP (TREET) called indirectly by GULP.

User has just lightgunned a line of displayed text. SETGULP calls
TENLIN to set up a ten-line segment of text (see Diagram 1) and then calls
KERNEL, which initializes the kernelizing mode.

SHOWT  (TREET)  called  indirectly  by  STXT  in  query  mode. 

User  has  just  lightgunned  an  answer.  Using  the  pointer  stored  with 
that  answer,  SHOWT  calls  TENLIN  and  initializes  the  select  mode. 

SLCTA  (TREET)  called  indirectly  by  SLCTC  in  select  mode. 

User has just lightgunned a word of displayed text. SLCTA deletes
all lines preceding the line gunned and calls BLLFT to delete all words
preceding the word gunned. Indirectly calls SLCTB.

SLCTB  (TREET)  called  indirectly  by  SLCTA  in  select  mode. 

User  has  just  lightgunned  a  word  of  displayed  text.  SLCTB  deletes 
all  lines  following  the  line  gunned  and  calls  BLRT  to  delete  all  words 
following  the  word  gunned. 

SLCTC (TREET) called by left-margin control word SELTEXT in select
mode.

Readies system for use of SLCTA and indirectly calls SLCTA.

SPLITDC (TREET) utility function.

Argument  is  a  list  of  10  cards.  SPLITDC  replaces  the  first  10  cards 
of  a  displayed  text  with  its  argument.  FRSTPG,  NXTPG,  and  PRVPG  use 
SPLITDC  to  display  different  ten-line  segments  of  text  without  affecting 
a  kernel  displayed  lower  on  the  scope. 

SRCH  (TREET)  called  by  SEARCH  or  STQRY  in  query  mode. 

Provides  answers  to  a  query.  Its  argument  is  a  list  of  11  members, 
which  is  matched  against  each  content  representation  in  succession. 

See  Diagram  2  and  the  accompanying  description  below. 

STMK (TREET) called indirectly by QUERY, REQUERY, and STMKA in query mode.

User has just typed in a word or a list of two members to be attached
to a line of the query display. STMK reads the input and readies the
system for the receipt of a lightgun input specifying the query line.
Indirectly calls STMKA.


STMKA  (TREET)  called  indirectly  by  STMK  and  STMKQ  in  query  mode. 

User  has  just  lightgunned  a  query  line.  STMKA  attaches  the  input 
read  in  by  STMK  to  that  line.  Indirectly  calls  STMK. 

STMKQ (TREET) called by left-margin control word Q in query mode.

STMKQ simulates receipt of the input "Q" by STMK and indirectly calls
STMKA.

STQRY (TREET) called by left-margin control word PERFORM in query mode.

Prepares,  from  the  filled-in  query  display,  an  11-member  list  suitable 
as  input  for  SRCH.  It  calls  SRCH,  then  displays  the  answers  below  the  query 
display,  under  the  headings  of  the  categories  queried. 

STXT  (TREET)  called  by  left-margin  control  word  SHOWTEXT  in  query  mode. 
Indirectly  calls  SHOWT. 

TELL (TREET) called by PARSENT.

Displays the lexical entries for each item in the kernel. The user may
then change or add to the list of grammatical categories for an item using
LEXADD.

TENLIN (TREET) utility function.

Argument is a pointer to a sublist of cards in the original text.
TENLIN creates a new list of ten lines of text starting at that point,
preserving backward pointers to the original text. See Diagram 1.

TYPINA  (TREET)  called  indirectly  by  TYPINB  in  kernelizing  mode. 

User has just typed in a word or words to be added to a kernel in the
process of construction. TYPINA uses CVWCL to convert both the previous
and the new inputs to lists, combines the two lists, and converts the
resulting list back to card format, displaying it as the new form of the
kernel. Indirectly calls CSTRGUN.

TYPINB (TREET) called by left-margin control word TYPIN in kernelizing
mode.

Indirectly calls TYPINA.

Diagram 1. Pointers in GULP

Notes: A. Arrows denote the usual forward pointers defining list
          structures.

       B. Arrows denote backward pointers which are inserted by GULP.

       C. The backward pointers are preserved by TENLIN when it creates
          a separate list of 10 lines of the original text. Note that
          line n in a 10-line segment points back to line n-1 of the
          original text. For this reason all original texts must begin
          and end with a blank card.

       D. NXTPG, PRVPG, and FRSTPG use the backward pointers to
          return to the original text, then reposition and call
          TENLIN. Similarly, because a backward pointer is preserved
          with each content representation, SHOWT can call TENLIN
          to recover the corresponding text.
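The pointer scheme described in these notes can be sketched in Python as follows. This is a simplified model under stated assumptions, not the TREET implementation: the `Card` class, the helper `make_text`, and the paging functions `next_page` and `prev_page` are hypothetical names, and a Python list stands in for the ten-line segment.

```python
# Hypothetical model of the forward/backward pointers; names are illustrative.
class Card:
    def __init__(self, text, back=None):
        self.text = text
        self.next = None     # forward pointer (ordinary list structure)
        self.back = back     # backward pointer, filled in by gulp()


def make_text(lines):
    """Build a forward-linked text that begins and ends with a blank card."""
    cards = [Card("")] + [Card(line) for line in lines] + [Card("")]
    for a, b in zip(cards, cards[1:]):
        a.next = b
    return cards[0]


def gulp(head):
    """Give every card a backward pointer to its predecessor."""
    prev, card = None, head
    while card is not None:
        card.back = prev
        prev, card = card, card.next
    return head


def tenlin(start, size=10):
    """Copy up to `size` cards from `start`, keeping the original back pointers."""
    segment, card = [], start
    while card is not None and len(segment) < size:
        segment.append(Card(card.text, back=card.back))
        card = card.next
    for a, b in zip(segment, segment[1:]):
        a.next = b
    return segment


def original_of(copy):
    """Line n of a segment points back to line n-1 of the original text."""
    return copy.back.next


def next_page(segment, size=10):
    start = original_of(segment[-1]).next
    if start is None or start.next is None:   # only the final blank card remains
        return segment
    return tenlin(start, size)


def prev_page(segment, size=10):
    start = original_of(segment[0])
    for _ in range(size):
        if start.back.back is None:           # predecessor is the leading blank card
            break
        start = start.back
    return tenlin(start, size)
```

In this model the leading blank card guarantees that the first real line has a predecessor to point back to, which is why `original_of` can always be applied to a segment that starts at or after the first line of text.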
Notes to Diagram 2. SRCH

The input to SRCH is a list with eleven members, to be matched against
the stored content representations, each of which is also an eleven-member
list. The order of the list is as follows: (ACTION SUBJECT PROPERTY
QUANTITY OBJECT PROPERTY QUANTITY LOCATION PROPERTY QUANTITY TIME).

Step (1) empties the temporary answer list. The input query may match a
content representation vacuously, i.e., match where the queried categories
are unspecified in the content representation. (For instance, the query
(BOMBED JETS NIL NIL SITES Q NIL ... NIL), which asks "What types of sites
were bombed by jets?", will match the representation (BOMBED JETS NIL NIL
SITES NIL 25 NIL ... NIL), which corresponds to the kernel "Jets bombed 25
sites", but return no answers.) The flag FLG is used to mark non-vacuous
cases.
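The eleven-member layout and the vacuous-match example can be made concrete with a short Python sketch. The slot labels below are assumptions introduced for readability: the report lists PROPERTY and QUANTITY three times without qualification, and the sketch assumes each pair belongs to the SUBJECT, OBJECT, or LOCATION that precedes it. `None` stands for NIL, and `make_entry` is a hypothetical helper.

```python
# Hypothetical slot labels for the eleven-member list; None stands for NIL.
SLOTS = (
    "ACTION",
    "SUBJECT", "SUBJECT_PROPERTY", "SUBJECT_QUANTITY",
    "OBJECT", "OBJECT_PROPERTY", "OBJECT_QUANTITY",
    "LOCATION", "LOCATION_PROPERTY", "LOCATION_QUANTITY",
    "TIME",
)


def make_entry(**filled):
    """Build an eleven-member list; slots not mentioned are left as None (NIL)."""
    unknown = set(filled) - set(SLOTS)
    if unknown:
        raise KeyError(f"unknown slots: {sorted(unknown)}")
    return [filled.get(slot) for slot in SLOTS]


# Kernel "Jets bombed 25 sites": the quantity of the object is filled in.
representation = make_entry(
    ACTION="BOMBED", SUBJECT="JETS", OBJECT="SITES", OBJECT_QUANTITY="25"
)

# "What types of sites were bombed by jets?": the object's property is queried.
query = make_entry(
    ACTION="BOMBED", SUBJECT="JETS", OBJECT="SITES", OBJECT_PROPERTY="Q"
)
```

The queried slot (`OBJECT_PROPERTY`) is NIL in the representation, so the two lists match but there is nothing to return, which is the vacuous case.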

After selecting the next content representation (2), and saving its
pointer (3), the program begins to cycle through the input query and the
content representation (4), which, of course, have the same length. If
the next member of the query list is 'Q' (5), it determines whether the
corresponding member of the representation is NIL or not (5A). If that
member is not NIL, it is added to a temporary answer list (5B), pending
verification of a complete match of query with representation, and FLG
is turned on (5B1). Q, of course, matches any member of a representation.
A NIL in a query matches anything (6). Otherwise the query entry must be
identical to the representation entry (7), unless the representation entry
is a list (8) (a prepositional phrase, e.g., (NEAR HANOI), or a verb with
a modal, e.g., (CAN BOMB)). In this case a match occurs if the query
entry is identical with the second member of this list, e.g., HANOI
matches (NEAR HANOI). Note that a query entry (NEAR HANOI) does not
match a representation entry HANOI. If the two entries do not match,
the representation is discarded, temporary answers are forgotten, and
the next representation is considered.

If all entries match, temporary answers are recorded as answers (9),
and FL is turned on (10) if the answer was non-vacuous. The next
representation is then considered.

When all representations have been examined, the answer list is
examined (11). If no (non-vacuous) answers were found, the message NO
MATCH is typed out (11A). Otherwise, the answers are examined (12),
each answer is put on a card, and the card is given a pointer to a gulp
of text, namely the gulp pointed to by the representation giving rise to
that answer (13). This list of cards (14) is the value of the function
SRCH (15).
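The matching procedure in these notes can be sketched in Python as follows. This is a reading of the prose, not a transcription of Diagram 2, and it rests on several assumptions: `None` stands for NIL, each stored representation is paired with an opaque pointer to its gulp of text, an answer "card" is modelled as an (answer, pointer) tuple, the flag bookkeeping is replaced by a test that the temporary answer list is non-empty, and a list-valued query entry is allowed to match an identical list in the representation. The names `entries_match` and `srch` are hypothetical.

```python
# Hypothetical sketch of the SRCH matching loop; None stands for NIL.
def entries_match(q, r):
    """Steps (5)-(8): does one query entry match one representation entry?"""
    if q == "Q" or q is None:
        return True                      # Q and NIL match anything
    if isinstance(r, list):
        # e.g. ["NEAR", "HANOI"]: the query may name the second member.
        # Accepting an identical list as well is an assumption of this sketch.
        return q == r or (len(r) > 1 and q == r[1])
    return q == r


def srch(query, representations):
    """Return (answer, pointer) pairs; an empty list corresponds to NO MATCH.

    `representations` is a sequence of (eleven-member list, pointer) pairs.
    """
    answers = []
    for rep, pointer in representations:
        temp = []                        # temporary answers for this representation
        for q, r in zip(query, rep):
            if not entries_match(q, r):
                break                    # discard representation and its temp answers
            if q == "Q" and r is not None:
                temp.append(r)           # non-vacuous answer, pending a full match
        else:
            # Every entry matched; a vacuous match leaves temp empty.
            answers.extend((answer, pointer) for answer in temp)
    return answers
```

A caller would type out NO MATCH when the returned list is empty; otherwise each pair carries the pointer needed to recover the original text.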